feat: don't split text inside ruby transcriptions#2795

Merged
birtles merged 16 commits into main from dont-split-rt-text on Apr 8, 2026

@birtles birtles commented Feb 26, 2026

Fixes #2785

@birtles birtles force-pushed the dont-split-rt-text branch from 9a71097 to c52b45a April 8, 2026 03:35
@birtles birtles force-pushed the dont-split-rt-text branch from d19f18c to b9ced9f April 8, 2026 04:45
@birtles birtles marked this pull request as ready for review April 8, 2026 06:16

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 18f77cdf9b

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/background/jpdict.ts Outdated
Comment thread src/content/scan-text.ts

birtles commented Apr 8, 2026

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c7d8ef1da3

Comment thread src/background/jpdict.ts
> {
let [word, inputLengths] = normalizeInput(input);
noSplitMask = remapNoSplitMaskToNormalized({ inputLengths, noSplitMask });
noSplitMask = addYoonToNoSplitMask({ input: word, noSplitMask });

P2: Compute yoon no-split mask after fallback kana conversion

When the words DB falls back to the flat-file backend, word is converted from katakana to hiragana (kanaToHiragana) later in this function, but the yoon mask is computed earlier from the pre-conversion text. That means katakana inputs like キャ can lose yoon protection in fallback mode and get shortened to キ, which is a behavior regression from the previous endsInYoon logic (it ran after conversion inside wordSearch). This affects lookup accuracy specifically while the IndexedDB words DB is unavailable/updating.

Useful? React with 👍 / 👎.
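
To make the ordering problem described above concrete, here is a minimal sketch. It is not the project's code: `toHiraganaSketch` and `yoonMaskSketch` are hypothetical stand-ins, and the mask semantics (a boolean per character, where `true` means "do not split immediately before this character") are an assumption made only for illustration. The yoon check is deliberately hiragana-only, as the review implies the existing one is.

```typescript
// Hypothetical sketch: none of these helpers are the project's real code.

// Small kana that form the second half of a yoon (hiragana only here,
// mirroring the hiragana-only check the review describes).
const SMALL_HIRAGANA_Y = 'ゃゅょ';

// Toy katakana -> hiragana conversion: the two Unicode blocks are offset by 0x60.
function toHiraganaSketch(text: string): string {
  return text
    .split('')
    .map((ch) => {
      const code = ch.charCodeAt(0);
      return code >= 0x30a1 && code <= 0x30f6
        ? String.fromCharCode(code - 0x60)
        : ch;
    })
    .join('');
}

// Assumed semantics: mask[i] === true means
// "do not split between index i - 1 and index i".
function yoonMaskSketch(text: string): boolean[] {
  return text
    .split('')
    .map((ch, i) => i > 0 && SMALL_HIRAGANA_Y.indexOf(ch) !== -1);
}

const input = 'キャク';

// Mask computed from the katakana text: the small ャ is not recognised,
// so nothing is protected.
const maskBeforeConversion = yoonMaskSketch(input);

// Mask computed after conversion: the ゃ is recognised and protected.
const maskAfterConversion = yoonMaskSketch(toHiraganaSketch(input));
```

Under these assumptions, computing the mask before conversion leaves the キ/ャ boundary unprotected, while computing it after conversion protects it.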

birtles (Member, Author) replied:

Better still, we should make isYoon work on katakana キャ etc.
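
One way the suggestion above might look is sketched below. The real `isYoon` signature is not shown in this thread, so `isYoonSketch`, its two-character string argument, and the character tables are all assumptions made for illustration.

```typescript
// Hypothetical sketch: the real isYoon may take different arguments.

// Kana that can start a yoon, in both scripts.
const YOON_BASES = 'きぎしじちぢにひびぴみりキギシジチヂニヒビピミリ';

// Small ya / yu / yo, in both scripts.
const YOON_SMALL = 'ゃゅょャュョ';

// Returns true when `text` is exactly one yoon such as きゃ or キャ.
// Note: this simple version also accepts mixed-script pairs like きャ.
function isYoonSketch(text: string): boolean {
  return (
    text.length === 2 &&
    YOON_BASES.indexOf(text.charAt(0)) !== -1 &&
    YOON_SMALL.indexOf(text.charAt(1)) !== -1
  );
}
```

With a script-agnostic check like this, the result would not depend on whether the mask is built before or after any katakana-to-hiragana conversion.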

Comment thread src/content/scan-text.ts
@birtles birtles enabled auto-merge (squash) April 8, 2026 07:14
@birtles birtles merged commit 3a9d303 into main Apr 8, 2026
2 checks passed
@birtles birtles deleted the dont-split-rt-text branch April 8, 2026 07:16
Development

Successfully merging this pull request may close these issues.

single segments of ruby text shouldn't be broken up

1 participant